Conceptual complexity and the bias/variance tradeoff.
Authors
Abstract
In this paper we propose that the conventional dichotomy between exemplar-based and prototype-based models of concept learning is helpfully viewed as an instance of what is known in the statistical learning literature as the bias/variance tradeoff. The bias/variance tradeoff can be thought of as a sliding scale that modulates how closely any learning procedure adheres to its training data. At one end of the scale (high variance), models can entertain very complex hypotheses, allowing them to fit a wide variety of data very closely, but as a result they can generalize poorly, a phenomenon called overfitting. At the other end of the scale (high bias), models make relatively simple and inflexible assumptions, and as a result may fit the data poorly, a phenomenon called underfitting. Exemplar and prototype models of category formation are at opposite ends of this scale: prototype models are highly biased, in that they assume a simple, standard conceptual form (the prototype), while exemplar models have very little bias but high variance, allowing them to fit virtually any combination of training data. We investigated human learners' position on this spectrum by confronting them with category structures at variable levels of intrinsic complexity, ranging from simple prototype-like categories to much more complex multimodal ones. The results show that human learners adopt an intermediate point on the bias/variance continuum, inconsistent with either of the poles occupied by most conventional approaches. We present a simple model that adjusts (regularizes) the complexity of its hypotheses to suit the training data, and which fits the experimental data better than representative exemplar and prototype models.
Similar references
LSTD on Sparse Spaces
Efficient model selection and value function approximation are tricky tasks in reinforcement learning (RL), when dealing with large feature spaces. Even in batch settings, when the number of observed trajectories is small and the feature set is high-dimensional, there is little hope that we can learn a good value function directly based on all the features. To get better convergence and handle ...
The Bias-Variance Tradeoff and the Randomized GACV
We propose a new in-sample cross validation based method (randomized GACV) for choosing smoothing or bandwidth parameters that govern the bias-variance or fit-complexity tradeoff in 'soft' classification. Soft classification refers to a learning procedure which estimates the probability that an example with a given attribute vector is in class 1 vs class 0. The target for optimizing the tra...
The Use of Relevance to Evaluate Learning Biases
This paper describes Probabilistic Bias Evaluation (PBE), a method for evaluating learning biases by formally analyzing the tradeoff between the expected accuracy and complexity of alternative biases. Intelligent agents must filter out irrelevant aspects of the environment in order to minimize the costs of learning. In PBE, probabilistic background knowledge about relevance is used to compute exp...
Bias-variance analysis in estimating true query model for information retrieval
The estimation of the query model is an important task in language modeling (LM) approaches to information retrieval (IR). The ideal estimate is expected to be not only effective, in terms of high mean retrieval performance over all queries, but also stable, in terms of low variance of retrieval performance across different queries. In practice, however, improving effectiveness can sacrifice stabil...
Bias-Variance Techniques for Monte Carlo Optimization: Cross-validation for the CE Method
In this paper, we examine the CE method in the broad context of Monte Carlo Optimization (MCO) [Ermoliev and Norkin, 1998, Robert and Casella, 2004] and Parametric Learning (PL), a type of machine learning. A well-known overarching principle used to improve the performance of many PL algorithms is the bias-variance tradeoff [Wolpert, 1997]. This tradeoff has been used to improve PL algorithms r...
Journal: Cognition
Volume: 118, Issue: 1
Pages: -
Publication date: 2011